Multi-class support vector machines for protein secondary structure prediction.

Authors

  • Minh N Nguyen
  • Jagath C Rajapakse
Abstract

The solution of binary classification problems using the Support Vector Machine (SVM) method has been well developed. Although multi-class classification is typically solved by combining several binary classifiers, several multi-class methods that consider all classes at once have recently been proposed. However, these methods require solving a much larger optimization problem and are applicable only to small datasets. Three methods based on binary classification, namely one-against-all (OAA), one-against-one (OAO), and directed acyclic graph (DAG), and two approaches that treat the multi-class problem by solving a single optimization problem are implemented to predict protein secondary structure. Our experiments indicate that multi-class SVM methods are more suitable for protein secondary structure (PSS) prediction than the other methods, including binary SVMs, because of their capacity to solve the optimization problem in one step. Furthermore, we argue that it is feasible to improve the prediction accuracy by adding a second-stage multi-class SVM that captures the contextual information among secondary structural elements. We demonstrate, using two datasets, that two-stage SVMs perform better than single-stage SVM techniques for PSS prediction, and report a maximum accuracy of 79.5%.
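
The three binary-decomposition schemes named in the abstract differ only in how the outputs of the binary classifiers are combined into one class label. The following is a minimal sketch of that combination step, not the authors' implementation. It assumes the usual three-state labels (H for helix, E for strand, C for coil), which the abstract does not spell out, and it takes the binary decision values as given inputs instead of training SVMs. The function names `predict_oaa`, `predict_oao`, and `predict_dag` are hypothetical.

```python
from itertools import combinations

# Assumed three-state secondary structure labels: helix, strand, coil.
CLASSES = ["H", "E", "C"]


def predict_oaa(scores):
    """One-against-all: scores[c] is the decision value of the
    'class c versus the rest' classifier; the largest value wins."""
    return max(CLASSES, key=lambda c: scores[c])


def predict_oao(pairwise):
    """One-against-one: pairwise[(a, b)] > 0 means the (a, b) classifier
    prefers a, otherwise b. Each pair casts one vote; most votes wins.
    Ties are broken by the order of CLASSES (an arbitrary choice here)."""
    votes = {c: 0 for c in CLASSES}
    for a, b in combinations(CLASSES, 2):
        winner = a if pairwise[(a, b)] > 0 else b
        votes[winner] += 1
    return max(CLASSES, key=lambda c: votes[c])


def predict_dag(pairwise):
    """Directed acyclic graph: compare the first and last remaining
    classes, eliminate the loser, and repeat until one class is left.
    Only len(CLASSES) - 1 binary classifiers are evaluated per sample."""
    remaining = list(CLASSES)
    while len(remaining) > 1:
        a, b = remaining[0], remaining[-1]
        if pairwise[(a, b)] > 0:
            remaining.pop()       # b is eliminated
        else:
            remaining.pop(0)      # a is eliminated
    return remaining[0]


# Made-up decision values for a single residue, for illustration only.
oaa_scores = {"H": -0.4, "E": 0.7, "C": 0.1}
pair_scores = {("H", "E"): -1.2, ("H", "C"): 0.3, ("E", "C"): 0.8}

label_oaa = predict_oaa(oaa_scores)
label_oao = predict_oao(pair_scores)
label_dag = predict_dag(pair_scores)
```

With these illustrative values all three schemes return "E". OAA needs one classifier per class, whereas OAO and DAG share the same set of pairwise classifiers and differ only at prediction time: OAO evaluates every pair, while DAG follows a single elimination path.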

Similar resources

Protein Secondary Structure Prediction: a Literature Review with Focus on Machine Learning Approaches

A DNA sequence, although it contains all genetic traits, is not a functional entity. Instead, it is converted into protein sequences through transcription and translation. The protein sequence then takes on a 3D structure, which is the functional unit and can manage biological interactions using the information encoded in DNA. Every life process one can imagine is undertaken by proteins with specific functio...

Estimating the Class Posterior Probabilities in Protein Secondary Structure Prediction

Support vector machines, whether bi-class or multi-class, have proved efficient for protein secondary structure prediction. They can be used as sequence-to-structure classifiers, structure-to-structure classifiers, or both. Compared to the classifier most commonly found in the main prediction methods, the multi-layer perceptron, they exhibit one single drawback: their outputs are not c...

Protein Secondary Structure Prediction with Support Vector Machines

In this paper, a method for secondary structure prediction with support vector machines is presented. The system uses two layers of support vector machines, with a weighted cost function to balance the uneven class memberships. Using this method, prediction accuracy reaches 71.5%, comparable to the best techniques available.

Two-Stage Multi-Class Support Vector Machines to Protein Secondary Structure Prediction

Bioinformatics techniques for protein secondary structure (PSS) prediction are mostly single-stage approaches in the sense that they predict secondary structures of proteins by taking into account only the contextual information in amino acid sequences. In this paper, we propose a two-stage Multi-class Support Vector Machine (MSVM) approach where an MSVM predictor is introduced to the output of the...

Protein Secondary Structure Prediction Using Support Vector Machines and a New Feature Representation

Knowledge of the secondary structure and solvent accessibility of a protein plays a vital role in the prediction of its fold, and eventually its tertiary structure. The challenging problem of predicting protein secondary structure from sequence alone is addressed. Support vector machines (SVM) are employed for the classification and the SVM outputs are converted to posterior probabilitie...

Journal title:
  • Genome informatics. International Conference on Genome Informatics

Volume 14

Publication date: 2003